    Efficient Bit-parallel Multiplication with Subquadratic Space Complexity in Binary Extension Field

    Bit-parallel multiplication in GF(2^n) with subquadratic space complexity has been explored in recent years because of its lower area cost compared with traditional parallel multipliers. Several algorithms based on the 'divide and conquer' technique have been proposed to build subquadratic space complexity multipliers. Among them, the Karatsuba algorithm and its generalizations are most often used to construct multiplication architectures with significantly improved efficiency. However, recursively applying a single type of Karatsuba formula may not yield an optimal structure for many finite fields. It has been shown that multiplier complexity can be improved by combining several methods. After a detailed study of existing subquadratic multipliers, this thesis proposes a new algorithm that finds the best combination of selected methods through comprehensive search for constructing polynomial multiplication over GF(2^n). Using this algorithm, improved architectures with a shortened critical path or reduced gate cost are obtained for a given value of n, where n lies in the range [126, 600], reflecting the key sizes of current cryptographic applications. With different input constraints, the proposed algorithm can also yield subquadratic space multiplier architectures optimized for trade-offs between space and time. Optimized multiplication architectures over NIST-recommended fields generated by the proposed algorithm are presented and analyzed in detail. Compared with existing works of subquadratic space complexity, the proposed architectures are highly modular and have improved space or time complexity. Finally, the generalization of the proposed algorithm to much larger fields is discussed.
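
As a rough illustration of the Karatsuba idea mentioned in the abstract above, the following sketch multiplies two polynomials over GF(2), stored as Python integers whose bits are the coefficients. It is not the thesis's search algorithm or its hardware architecture; the function names and the `threshold` parameter are assumptions made for this example, and reduction modulo a field polynomial is left out.

```python
def clmul_schoolbook(a: int, b: int) -> int:
    """Carry-less product of two GF(2)[x] polynomials by shift-and-XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a: int, b: int, threshold: int = 8) -> int:
    """Two-way Karatsuba over GF(2)[x]; `threshold` (>= 1) is an assumed cutoff."""
    n = max(a.bit_length(), b.bit_length())
    if n <= max(threshold, 1):
        return clmul_schoolbook(a, b)
    half = (n + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = clmul_karatsuba(a_lo, b_lo, threshold)
    hi = clmul_karatsuba(a_hi, b_hi, threshold)
    # Three sub-products instead of four; subtraction is XOR in characteristic 2.
    mid = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, threshold)
    return lo ^ ((mid ^ lo ^ hi) << half) ^ (hi << (2 * half))
```

For instance, `clmul_karatsuba(0b11, 0b11)` computes (x + 1)^2 = x^2 + 1. Replacing four sub-products with three at each level is what gives the subquadratic gate count; mixing this split with other formulas at different recursion levels is the kind of combination the abstract describes searching over.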

    The Natural Ecology and Stock Enhancement of the Edible Jellyfish (Rhopilema esculentum Kishinouye, 1891) in the Liaodong Bay, Bohai Sea, China

    Among the edible jellyfish species, Rhopilema esculentum Kishinouye, 1891, is one of the most abundantly consumed, and it is therefore an important fishery resource in China. The jellyfish fisheries in China show considerable annual fluctuations and have a very short season. In this chapter, we first review the natural ecology of R. esculentum, including its distribution and migration, growth model, and survival rate in the Liaodong Bay (LDB), based on the results of our field studies over more than 20 years. Second, we review the jellyfish fishery and population dynamics in the LDB. Third, we cover survey methods, catch prediction, enhancement assessment, and fishery management, based on our survey results from 2005 to 2010. Finally, we present our field and experimental results on resource restoration. The high commercial value of R. esculentum enhancement in the LDB has made this a very successful enterprise.

    DDC-PIM: Efficient Algorithm/Architecture Co-design for Doubling Data Capacity of SRAM-based Processing-In-Memory

    Processing-in-memory (PIM), as a novel computing paradigm, provides significant performance benefits by effectively reducing data movement. SRAM-based PIM has been demonstrated as one of the most promising candidates due to its endurance and compatibility. However, the integration density of SRAM-based PIM is much lower than that of non-volatile memory-based alternatives, due to its inherent 6T structure for storing a single bit. Within comparable area constraints, SRAM-based PIM exhibits notably lower capacity. Thus, aiming to unleash its capacity potential, we propose DDC-PIM, an efficient algorithm/architecture co-design methodology that effectively doubles the equivalent data capacity. At the algorithmic level, we propose a filter-wise complementary correlation (FCC) algorithm to obtain a bitwise complementary pair. At the architecture level, we exploit the intrinsic cross-coupled structure of 6T SRAM to store the bitwise complementary pair in their complementary states ($Q$/$\overline{Q}$), thereby maximizing the data capacity of each SRAM cell. The dual-broadcast input structure and reconfigurable unit support both depthwise and pointwise convolution, adhering to the requirements of various neural networks. Evaluation results show that DDC-PIM yields about 2.84× speedup on MobileNetV2 and 2.69× on EfficientNet-B0 with negligible accuracy loss compared with the PIM baseline implementation. Compared with state-of-the-art SRAM-based PIM macros, DDC-PIM achieves up to 8.41× and 2.75× improvement in weight density and area efficiency, respectively. Comment: 14 pages, to be published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)
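
To make the notion of a "bitwise complementary pair" in the abstract above more concrete, here is a small sketch of the arithmetic it implies for signed 8-bit weights. It does not reproduce the FCC algorithm or the DDC-PIM macro; the int8 two's-complement weight format and the function names are assumptions made for this example.

```python
def bitwise_complement_int8(w: int) -> int:
    """Signed 8-bit value whose stored bits are the inverse of w's bits."""
    u = (~w) & 0xFF                     # flip all eight bits
    return u - 256 if u >= 128 else u   # reinterpret as two's complement


def dot(xs, ws):
    """Plain dot product of activations and weights."""
    return sum(x * w for x, w in zip(xs, ws))


# Hypothetical filter and activations, chosen only for illustration.
weights = [5, -3, 0, 127, -128]
complement = [bitwise_complement_int8(w) for w in weights]
activations = [1, 2, 3, 4, 5]

# In two's complement, ~w == -w - 1, so the complementary filter's output
# follows from the original filter's output and the sum of the activations.
assert dot(activations, complement) == -dot(activations, weights) - sum(activations)
```

Under this assumption, a cell that holds one bit on Q also holds the matching bit of the complementary filter on the inverted node, which is the sense in which one stored filter can stand in for two.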

    PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment

    Two-party computation (2PC) is a promising way to enable privacy-preserving deep learning (DL). However, 2PC-based privacy-preserving DL implementations incur high comparison protocol overhead from the non-linear operators. This work presents PASNet, a novel systematic framework that enables low-latency, energy-efficient, accurate, and security-guaranteed 2PC-DL by integrating the hardware latency of the cryptographic building block into the neural architecture search loss function. We develop a cryptographic hardware scheduler and the corresponding performance model for Field Programmable Gate Arrays (FPGA) as a case study. The experimental results demonstrate that our lightweight model PASNet-A and heavyweight model PASNet-B achieve 63 ms and 228 ms latency, respectively, for private inference on ImageNet, which is 147 and 40 times faster than the SOTA CryptGPU system, and achieve 70.54% and 78.79% accuracy with more than 1000 times higher energy efficiency. Comment: DAC 2023 accepted publication; a short version was published at the AAAI 2023 workshop on DL-Hardware Co-Design for AI Acceleration: RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

    PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference

    The rapid growth and deployment of deep learning (DL) has been accompanied by emerging privacy and security concerns. To mitigate these issues, secure multi-party computation (MPC) has been proposed to enable privacy-preserving DL computation. In practice, MPC protocols often come with very high computation and communication overhead, which can prevent their adoption in large-scale systems. Two orthogonal research trends have attracted considerable interest in addressing the energy efficiency of secure deep learning: overhead reduction of the MPC comparison protocol, and hardware acceleration. However, existing approaches either achieve a low reduction ratio and suffer from high latency due to limited computation and communication savings, or are power-hungry because they mainly target general computing platforms such as CPUs and GPUs. In this work, as a first attempt, we develop a systematic framework, PolyMPCNet, for joint overhead reduction of the MPC comparison protocol and hardware acceleration, by integrating the hardware latency of the cryptographic building block into the DNN loss function to achieve high energy efficiency, accuracy, and security guarantees. Instead of heuristically checking model sensitivity after a DNN is well trained (by deleting or dropping some non-polynomial operators), our key design principle is to enforce exactly what is assumed in the DNN design, namely training a DNN that is both hardware efficient and secure, while escaping local minima and saddle points and maintaining high accuracy. More specifically, we propose a straight-through polynomial activation initialization method for a cryptographic-hardware-friendly trainable polynomial activation function that replaces the expensive 2P-ReLU operator. We develop a cryptographic hardware scheduler and the corresponding performance model for the Field Programmable Gate Array (FPGA) platform.
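
The abstract above motivates replacing a comparison-based ReLU with a polynomial activation. As a hedged illustration of why that helps, the sketch below evaluates a quadratic on additively secret-shared integers using one Beaver-triple multiplication and no comparison. It is a toy, single-process simulation and not PolyMPCNet's protocol; the modulus, the trusted-dealer triple, the integer coefficients, and all function names are assumptions made for this example.

```python
import secrets

MOD = 2**61 - 1  # assumed prime modulus for the toy share arithmetic


def share(x: int):
    """Split x into two additive shares modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD


def reconstruct(s0: int, s1: int) -> int:
    return (s0 + s1) % MOD


def beaver_mul(x_sh, y_sh):
    """Multiply two shared values using a triple from a simulated trusted dealer."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    a_sh, b_sh, c_sh = share(a), share(b), share(a * b % MOD)
    # The parties open the masked differences d = x - a and e = y - b.
    d = reconstruct((x_sh[0] - a_sh[0]) % MOD, (x_sh[1] - a_sh[1]) % MOD)
    e = reconstruct((y_sh[0] - b_sh[0]) % MOD, (y_sh[1] - b_sh[1]) % MOD)
    # x*y = c + d*b + e*a + d*e; only one party adds the public d*e term.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % MOD
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % MOD
    return z0, z1


def poly_activation_shared(x_sh, c2: int, c1: int, c0: int):
    """Shares of c2*x^2 + c1*x + c0; the constant is added by party 0 only."""
    sq = beaver_mul(x_sh, x_sh)
    y0 = (c2 * sq[0] + c1 * x_sh[0] + c0) % MOD
    y1 = (c2 * sq[1] + c1 * x_sh[1]) % MOD
    return y0, y1


def to_signed(v: int) -> int:
    """Map a residue back to a signed integer."""
    return v - MOD if v > MOD // 2 else v
```

For example, `to_signed(reconstruct(*poly_activation_shared(share(-3 % MOD), 2, 3, 1)))` recovers 2·9 − 9 + 1 = 10. Everything after the triple is local addition and scaling plus one round of opening, whereas a ReLU on shares needs a secure comparison, which is the overhead the abstract refers to.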

    Genome-wide association study of maize resistance to Pythium aristosporum stalk rot

    Stalk rot, a severe and widespread soil-borne disease of maize, reduces yield and quality worldwide. Recent reports show that Pythium aristosporum has emerged as one of the dominant causal agents of maize stalk rot. However, previous studies of maize stalk rot resistance mechanisms and breeding have mainly focused on other pathogens, neglecting P. aristosporum. Resistance breeding is the most economical and effective strategy for mitigating crop loss from this disease. In this study, we characterized resistance in 295 inbred lines using the drilling inoculation method and genotyped them by sequencing. By combining population structure, disease resistance phenotypes, and a genome-wide association study (GWAS) with six statistical methods, we identified 39 significant single-nucleotide polymorphisms (SNPs) associated with P. aristosporum stalk rot resistance. Bioinformatics analysis of these SNPs revealed 69 potential resistance genes, among which Zm00001d051313 was further evaluated for its role in the host defense response to P. aristosporum infection. Through virus-induced gene silencing (VIGS) verification and physiological index determination, we found that transient silencing of Zm00001d051313 promoted P. aristosporum infection, indicating a positive regulatory role of this gene in maize's antifungal defense mechanism. These findings will help advance the current understanding of the mechanisms underlying maize defense against Pythium stalk rot.